
    mockrobiota: a Public Resource for Microbiome Bioinformatics Benchmarking.

    Mock communities are an important tool for validating, optimizing, and comparing bioinformatics methods for microbial community analysis. We present mockrobiota, a public resource for sharing, validating, and documenting mock community data resources, available at http://caporaso-lab.github.io/mockrobiota/. The materials contained in mockrobiota include data set and sample metadata, expected composition data (taxonomy, gene annotations, or reference sequences for mock community members), and links to raw data (e.g., raw sequence data) for each mock community data set. mockrobiota does not supply physical sample materials directly, but the data set metadata included for each mock community indicate whether physical sample materials are available. At the time of this writing, mockrobiota contains 11 mock community data sets with known species compositions, including bacterial, archaeal, and eukaryotic mock communities, analyzed by high-throughput marker gene sequencing. IMPORTANCE: The availability of standard and public mock community data will facilitate ongoing method optimizations and comparisons across studies that share source data, increase transparency and access, and eliminate redundancy. These are also valuable resources for bioinformatics teaching and training. This dynamic resource is intended to expand and evolve to meet the changing needs of the omics community.

    Bioinformatics Needs Assessment

    An assessment of the Bioinformatics Program at MIT Libraries was conducted using quantitative and qualitative data collection methods during FY13-14. Interviews were conducted to gain insight into bioinformatics researchers’ needs and behaviors and into the bioinformatics support offered by the MIT Libraries. Data was collected from various services of the bioinformatics program as well as from other library services. The assessment found that the bioinformatics community is interdisciplinary and crosses traditional life science departmental boundaries. The bioinformatics community takes a collaborative do-it-yourself (DIY) approach to computational skills and analytical tools: if they don’t know something or don’t have something to use, they find someone who does or they build it themselves. Themes that emerged from the assessment include computational skills, tools, data, instruction, and interdisciplinarity. The bioinformatics community has a desire for computational skills and modular training. The MIT Libraries bioinformatics training sessions are well attended; training sessions taught by experts are popular. Recommendations for the Bioinformatics Program at MIT Libraries include being more aware of open source software tools used by the community, attempting to expand the use of commercial tools in courses, and expanding outreach and advocacy regarding bioinformatics to the entire MIT community.

    Short Courses: Flexible Learning Opportunities in Informatics

    In today’s fast-paced, data-driven world, researchers need to have a good foundation in informatics to store, organize, process, and analyze growing amounts of data. However, not all degree programs offer such training. Obtaining training in informatics on your own can be a daunting task for both new and established researchers who have little informatics experience. Providing educational opportunities that are appropriate for various skill levels and that mesh with a full-time schedule can remove barriers and foster a collaborative, informatics-savvy community that is better equipped to push science forward. To enhance informatics education in bioinformatics, VCU’s Wright Center for Clinical and Translational Research offers a complementary series of seminars and workshops. These short course offerings introduce attendees to bioinformatics concepts and applications, and provide hands-on experience using online bioinformatics databases. Bioinformatics 101 (B101) is an 8-week series of 1-hour seminars focused on introducing topics in bioinformatics related to Next Generation Sequencing (NGS). Lectures are application focused and include overviews of NGS technology, practical bioinformatics pipelines, and examples of how the technology can influence downstream bioinformatics analyses. Bioinformatics 102 (B102) is a 5-day workshop of 2 hours per day, developed in collaboration with VCU Libraries, that provides attendees with hands-on experience accessing and using public data repositories. Sessions include a brief lecture followed by hands-on exercises. A Certificate of Completion is awarded upon meeting certain criteria for either the 101 or 102 course. Bioinformatics 101 has been offered 3 times with a combined total of 246 registrants, and Bioinformatics 102 has been offered twice with a total of 78 registrants (limited to 30 per session per day). From course surveys, 82% (n=108) and 95% (n=47) of respondents gave B101 and B102 a positive rating, respectively. In addition, 89% of B101 respondents indicated their knowledge was improved, with 100% of B102 respondents indicating the same. A total of 84 and 33 certificates have been awarded for B101 and B102, respectively. The Bioinformatics 101 and 102 courses have become highly anticipated across the university, and have gained the external attention of surrounding businesses and colleges. Registrants have diverse backgrounds, including biological, clinical, computational, administrative, librarian, business, and others, with a total of 77 departments across VCU and VCU Health represented. Due to this interest, Bioinformatics 101 began offering live online attendance to accommodate those who were unable to travel across campus or who were attending from outside VCU. This past year, 50% of attendance was online, indicating a growing need for flexible education opportunities in informatics. Increasing researcher knowledge of bioinformatics, along with awareness of university resources for informatics support, fosters an informatics-savvy research community that is empowered to take advantage of existing and new data sources in the pursuit of new insights and scientific discoveries for the betterment of human health. Future work will include the development of a more comprehensive educational framework by creating new and flexible learning opportunities that will make informatics education easy and convenient for our dedicated researchers.

    Prospects and limitations of full-text index structures in genome analysis

    The combination of incessant advances in sequencing technology producing large amounts of data and innovative bioinformatics approaches, designed to cope with this data flood, has led to new interesting results in the life sciences. Given the magnitude of sequence data to be processed, many bioinformatics tools rely on efficient solutions to a variety of complex string problems. These solutions include fast heuristic algorithms and advanced data structures, generally referred to as index structures. Although the importance of index structures is generally known to the bioinformatics community, the design and potency of these data structures, as well as their properties and limitations, are less understood. Moreover, the last decade has seen a boom in the number of variant index structures featuring complex and diverse memory-time trade-offs. This article provides a comprehensive state-of-the-art overview of the most popular index structures and their recently developed variants. Their features, interrelationships, and the trade-offs they impose, as well as their practical limitations, are explained and compared.
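
    As a rough illustration of what a full-text index structure does, the sketch below builds a suffix array and uses it to locate every occurrence of a pattern by binary search. The suffix array is chosen here only as a familiar example; the abstract does not say which structures the article covers. The function names and the toy sequence are hypothetical, and the naive sort-based construction is suitable only for short strings, not genome-scale data.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of `text`, sorted lexicographically.

    Naive construction (sorts the suffixes directly); an assumption made for
    brevity, not a method taken from the article.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, suffix_array, pattern):
    """Return the sorted start positions of `pattern` in `text`.

    Suffixes that begin with `pattern` form one contiguous block of the
    suffix array, so two binary searches find the block boundaries.
    """
    m = len(pattern)

    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(suffix_array[start:lo])


# Toy example sequence (hypothetical, not from the article).
genome = "GATTACAGATTACA"
index = build_suffix_array(genome)
print(find_occurrences(genome, index, "TTA"))  # prints [2, 9]
```

    The memory-time trade-off mentioned in the abstract is visible even here: the index stores one integer per text position, and in exchange each query needs only a logarithmic number of prefix comparisons instead of a full scan of the text.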

    Lessons Learned: Recommendations for Establishing Critical Periodic Scientific Benchmarking

    The dependence of life scientists on software has steadily grown in recent years. For many tasks, researchers have to decide which of the available bioinformatics software is most suitable for their specific needs. Additionally, researchers should be able to objectively select the software that provides the highest accuracy, the best efficiency, and the highest level of reproducibility when integrated in their research projects. Critical benchmarking of bioinformatics methods, tools, and web services is therefore an essential community service, as well as a critical component of reproducibility efforts. Unbiased and objective evaluations are challenging to set up and can only be effective when built and implemented around community-driven efforts, as demonstrated by the many ongoing community challenges in bioinformatics that followed the success of CASP. Community challenges bring the combined benefits of intense collaboration, transparency, and standard harmonization. Only open systems for the continuous evaluation of methods offer a perfect complement to community challenges, giving larger communities of users, which could extend far beyond the community of developers, a window onto the development status that they can use for their specific projects. By continuous evaluation systems we mean services that are always available and periodically update their data and/or metrics according to a predefined schedule, keeping in mind that performance must always be seen in terms of each research domain. We argue here that technology is now mature enough to bring community-driven benchmarking efforts to a higher level that should allow effective interoperability of benchmarks across related methods. New technological developments allow overcoming the limitations of the first experiences with online benchmarking, e.g., EVA. We therefore describe OpenEBench, a novel infrastructure designed to establish a continuous automated benchmarking system for bioinformatics methods, tools, and web services. OpenEBench is being developed to cater for the needs of the bioinformatics community, especially software developers who need an objective and quantitative way to inform their decisions, as well as the larger community of end-users in their search for unbiased and up-to-date evaluation of bioinformatics methods. As such, OpenEBench should soon become a central place for bioinformatics software developers, community-driven benchmarking initiatives, researchers using bioinformatics methods, and funders interested in the results of methods evaluation.